1. INTRODUCTION:

Fisher's linear classifiers are widely used in discriminant analysis. They are optimum when the underlying distributions are gaussian with a common covariance matrix.

In other cases they are not optimum, but their simplicity compensates for the loss in performance.

The probability of error is the key quantity in pattern recognition, and a considerable literature therefore exists on its calculation and estimation. As discussed in Fukunaga (1972), there are two cases of interest as far as the estimation of the probability of error is concerned. In the first case, the classifier is given and a sample of N observations is also given. This problem is comparatively easy and, as shown by Highleyman (1962) [see Fukunaga (1972, pages 145-6)], an unbiased, minimum variance estimator is given by the ratio of the number of misclassified observations to N [when the a priori probabilities of the classes are not available; if these prior probabilities are known, a slightly better estimator can be obtained]. This is a general result, i.e. it applies to all possible classifiers. In the second case, a sample consisting of N observations is available and the classifier has to be estimated along with the probability of misclassification. Obviously, the estimator of the probability of misclassification then depends on the given class distributions and on the classifier to be used. In this paper we consider linear classifiers and gaussian distributions only.

At this point, the following general result of Hills (1966) is worth recalling. Suppose $\epsilon(\theta_1, \theta_2)$ denotes the probability of error, where $\theta_1$ are the parameters of the distributions used to design the Bayes classifier and $\theta_2$ are the parameters of the distributions used to test the performance of this Bayes classifier. If, for $\theta_1 = \theta_2 = \theta$, $\hat\theta_N$ denotes the estimator of the parameters based on a sample of size N and if

$E[\epsilon(\theta, \hat\theta_N)] = \epsilon(\theta, \theta)$

then

$E[\epsilon(\hat\theta_N, \hat\theta_N)] \le \epsilon(\theta, \theta)$.

Thus, if the same sample is used to obtain both the classifier and the estimator of the probability of misclassification, then the method provides an optimistic estimator, i.e. the expected value of the estimator is no larger than the true value. Following Fukunaga (1972) we call this method the C-method. Thus, in the C-method a given sample is first used to obtain the classifier and is then used again for testing its performance.

In another approach the given sample is used to obtain the classifier and a fresh sample is used to test its performance. Among the many possibilities that use this approach, the leaving-one-out method [Lachenbruch (1965)] is rather economical. In this method, the sample of N observations is divided into two parts consisting of (N-1) and 1 observations respectively. The first set is used to construct the classifier and the remaining observation is used for testing. The procedure is repeated N times, each observation being left out once. Throughout this paper, we will denote this method by the U-method.

Fukunaga and Kessel (1971) showed that for any random sample from the gaussian distributions, if an observation of the sample is misclassified by the C-method then it will also be misclassified by the U-method, but the converse need not be true. Thus, for any sample the estimate of the error probability obtained by the C-method will be no larger than that obtained by the U-method.
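The relation between the two methods can be illustrated with a minimal Python sketch. It assumes the known-covariance case with identity covariance matrix, two small simulated gaussian samples, and hypothetical helper names (`c_flags`, `u_flags`) that are not part of the paper. For each class-1 observation it records whether the C-method and the U-method assign it to class 2.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def mean(vectors):
    k = len(vectors)
    return [sum(col) / k for col in zip(*vectors)]

def misclassified(x, xbar, ybar):
    """True if x (known to come from class 1) is assigned to class 2
    by the linear rule with identity covariance."""
    mid = [(a + b) / 2 for a, b in zip(xbar, ybar)]
    diff = [a - b for a, b in zip(xbar, ybar)]
    centered = [xi - mi for xi, mi in zip(x, mid)]
    return dot(centered, diff) < 0

def c_flags(xs, ys):
    """C-method: design and test on the same sample."""
    xbar, ybar = mean(xs), mean(ys)
    return [misclassified(x, xbar, ybar) for x in xs]

def u_flags(xs, ys):
    """U-method: leave each class-1 observation out of its own class mean."""
    ybar = mean(ys)
    flags = []
    for i, x in enumerate(xs):
        rest = xs[:i] + xs[i + 1:]
        flags.append(misclassified(x, mean(rest), ybar))
    return flags

random.seed(0)  # illustrative data only
p, m, n = 3, 8, 8
xs = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(m)]
ys = [[random.gauss(0.5, 1.0) for _ in range(p)] for _ in range(n)]
c, u = c_flags(xs, ys), u_flags(xs, ys)
e_c, e_u = sum(c) / m, sum(u) / m
print(e_c, e_u)
```

In this identity-covariance setting every observation flagged by the C-method is also flagged by the U-method, because leaving an observation out moves its own class mean away from it.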

In this paper we consider the C-method and the U-method of estimation of the probability of error for linear classifiers and gaussian distributions. Our aim is to evaluate the mean square errors of the estimators for the purpose of comparison.

Recall that mean square error is a good measure for comparing two estimators. Until recently these estimators were compared mostly through their expectations. Although there are several numerical studies, theoretical results are only recent; see Sorem (1971, 1972) and Dasgupta (1974). The results of Moore, Whitsitt and Landgrebe (1976) are also of interest for a related problem. Fukunaga's (1972, page 159) empirical study is of immediate interest to us.

In Section 2 some basic notation is introduced. In Section 3 we present Foley's (1972) results regarding the expectation of the estimator of the probability of error for the C-method when the covariances are known, and the results of John (1961) which are applicable to the U-method. In Section 4 the C-method and the U-method are considered when the common covariance matrix is unknown, and the expected values of the estimators are studied. In Section 5 we consider the variance of the estimator of the error probability given by the C-method and finally, in Section 6, the same is evaluated for the U-method. In both Sections 5 and 6 the covariances are assumed known. Computer programs to evaluate these variances are presented in the Appendix.

2. NOTATIONS AND PRELIMINARIES:


We consider the two-class pattern recognition problem. The two classes are denoted by $C_1$ and $C_2$ respectively. The corresponding p-variate gaussian densities are denoted by $\phi(\mu_1, \Sigma)$ and $\phi(\mu_2, \Sigma)$ respectively. For the sake of convenience we assume that the classes have equal a priori probabilities. A random sample of size m from the first class is denoted by $X_1, \ldots, X_m$ and similarly another, independent random sample of size n from $C_2$ by $Y_1, \ldots, Y_n$.

An observation X is classified as belonging to the class $C_1$ if

(2.1)  $X'\Sigma^{-1}(\bar X - \bar Y) - \tfrac12(\bar X + \bar Y)'\Sigma^{-1}(\bar X - \bar Y) > 0$

when only $\mu_1$ and $\mu_2$ are unknown and $\Sigma$ is assumed known; $\bar X$ and $\bar Y$ denote the sample means, i.e.

(2.2)  $\bar X = m^{-1}\sum_{i=1}^{m} X_i$,  $\bar Y = n^{-1}\sum_{j=1}^{n} Y_j$.

If $\Sigma$ is also unknown, then the above linear classifier has to be modified, and X is classified to $C_1$ provided

(2.3)  $X'S^{-1}(\bar X - \bar Y) - \tfrac12(\bar X + \bar Y)'S^{-1}(\bar X - \bar Y) > 0$

where $\bar X$ and $\bar Y$ are defined by (2.2) and

(2.4)  $S = (m+n-2)^{-1}\{\sum_{i=1}^{m}(X_i - \bar X)(X_i - \bar X)' + \sum_{j=1}^{n}(Y_j - \bar Y)(Y_j - \bar Y)'\}$.
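The pooled estimate in (2.4) can be sketched in plain Python as follows. The function names and the four data points are illustrative assumptions, chosen only so that the arithmetic is easy to follow by hand.

```python
def mean(vectors):
    k = len(vectors)
    return [sum(col) / k for col in zip(*vectors)]

def scatter(vectors):
    """Sum of outer products of deviations from the sample mean."""
    mu = mean(vectors)
    p = len(mu)
    s = [[0.0] * p for _ in range(p)]
    for v in vectors:
        d = [vi - mi for vi, mi in zip(v, mu)]
        for r in range(p):
            for c in range(p):
                s[r][c] += d[r] * d[c]
    return s

def pooled_covariance(xs, ys):
    """S of (2.4): pooled scatter divided by m + n - 2."""
    sx, sy = scatter(xs), scatter(ys)
    dof = len(xs) + len(ys) - 2
    p = len(sx)
    return [[(sx[r][c] + sy[r][c]) / dof for c in range(p)] for r in range(p)]

xs = [[0.0, 0.0], [2.0, 2.0]]   # hypothetical class-1 sample
ys = [[1.0, 0.0], [1.0, 2.0]]   # hypothetical class-2 sample
S = pooled_covariance(xs, ys)
print(S)  # [[1.0, 1.0], [1.0, 2.0]]
```

The classifier (2.3) additionally needs the inverse of S, which is omitted here to keep the sketch dependency-free.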

Probabilities of misclassification are given by

(2.5)  $p_{ij} = P[X \in C_j \mid X \in C_i]$,  $i \ne j$,  $i, j = 1, 2$.

Our aim is to study the estimates of the $p_{ij}$'s using the classifiers defined in (2.1) and (2.3). We recall, at this time, that the classifiers are themselves random. Due to the symmetry of the problem, it is sufficient to consider estimates of only one of $p_{12}$ and $p_{21}$.

3. EXPECTED VALUES OF THE ESTIMATES OF THE PROBABILITY OF ERROR, $\Sigma$ KNOWN:


In this section the C-method and the U-method are used to obtain estimates of the probability of error when the common covariance matrix is known. The expected values of these estimators are obtained. It can easily be seen that when the covariance matrices are assumed known, these results can be generalized to the case in which the two classes are allowed to have different covariance matrices. Moreover, without loss of generality, we can assume $\Sigma = I$. Thus, in the following sections the common covariance matrix is taken to be the identity matrix.

The results of Section 3.1 were obtained by Foley (1972); our treatment is only slightly different. Section 3.2 contains results obtained by John (1961).


3.1  The  C-method. 


Let

(3.1)  $T_i = 1$ if $X_i$ is classified as an element of class $C_2$, and $T_i = 0$ otherwise,

when the classifier (2.1) is employed. Then an estimate of $p_{21}$ is given by

(3.2)  $e_1 = m^{-1}\sum_{i=1}^{m} T_i$.

Clearly, by symmetry in the X's,

(3.3)  $E(e_1) = E(T_1) = P[(X_1 - \tfrac12(\bar X + \bar Y))'(\bar X - \bar Y) < 0]$
       $= E_{\bar X,\bar Y}[P\{(X_1 - \tfrac12(\bar X + \bar Y))'(\bar X - \bar Y) < 0 \mid \bar X, \bar Y\}]$.


But it can be easily verified that the conditional distribution of $X_1$, given $\bar X$, $\bar Y$, is gaussian with mean vector $\bar X$ and covariance matrix $\{(m-1)/m\}I$. Thus, the conditional distribution of $(X_1 - \tfrac12(\bar X+\bar Y))'(\bar X-\bar Y)$, given $\bar X$, $\bar Y$, will also be univariate gaussian with mean $\tfrac12(\bar X-\bar Y)'(\bar X-\bar Y)$ and variance $\{(m-1)/m\}(\bar X-\bar Y)'(\bar X-\bar Y)$. But $Z = (\bar X-\bar Y)'(\bar X-\bar Y)$ is itself a random variable satisfying the noncentral chi-square distribution with p degrees of freedom and noncentrality parameter

$\lambda^2 = \frac{mn}{m+n}(\mu_1-\mu_2)'(\mu_1-\mu_2) = \frac{mn}{m+n}\,\delta^2$

where $\delta$ is the usual Mahalanobis distance between the two gaussian distributions. Let $g(z;\lambda)$ denote the density of Z, i.e.,

$g(z;\lambda) = \sum_{k=0}^{\infty} e^{-\lambda^2/2}\,\frac{(\lambda^2/2)^k}{k!}\,\frac{z^{p/2+k-1}e^{-z/2}}{2^{p/2+k}\,\Gamma(p/2+k)}$.

Thus, from (3.3) and the above discussion,

(3.4)  $E(e_1) = \int_0^{\infty}\Big[\int_{-\infty}^{0}\phi\big(x;\tfrac12 z,\tfrac{m-1}{m}z\big)\,dx\Big]\,g(z;\lambda)\,dz$

where $\phi(x;\mu,\sigma^2)$ denotes the univariate gaussian density with mean $\mu$ and variance $\sigma^2$. At this stage the order of integration and summation can be interchanged, and the double integral can be evaluated after changing to the polar coordinate system. The final result is obtained in the form of an infinite series given below.

(3.5)


I (p+2k ,m+n) 


' , a2,  .a2."  n^f*D 

- I exp[ — jl  (-7)  prr — = — 5- 
k=o  1 T(|)  r(k+|: 


where 


H M-l 

I (M, N)  = I Sin  e d9 


A = tan-1 /2 (N-l) 
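As an independent check on a series expression of this kind, $E(e_1)$ can be approximated by simulation. The following is a minimal Monte Carlo sketch, not the paper's program: it assumes identity covariance, class means $(\delta, 0, \ldots, 0)$ and the origin, and hypothetical function names.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean(vectors):
    k = len(vectors)
    return [sum(col) / k for col in zip(*vectors)]

def resubstitution_error(xs, ys):
    """e_1 of (3.2): fraction of class-1 design points that the rule
    (2.1) with identity covariance assigns to class 2."""
    xbar, ybar = mean(xs), mean(ys)
    diff = [a - b for a, b in zip(xbar, ybar)]
    mid = [(a + b) / 2 for a, b in zip(xbar, ybar)]
    wrong = sum(1 for x in xs
                if dot([xi - mi for xi, mi in zip(x, mid)], diff) < 0)
    return wrong / len(xs)

def simulate_expected_e1(p, m, n, delta, reps, seed=0):
    """Monte Carlo average of e_1 over `reps` simulated design sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [[rng.gauss(delta if k == 0 else 0.0, 1.0) for k in range(p)]
              for _ in range(m)]
        ys = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
        total += resubstitution_error(xs, ys)
    return total / reps

est_null = simulate_expected_e1(p=3, m=5, n=5, delta=0.0, reps=2000)
est_sep = simulate_expected_e1(p=3, m=5, n=5, delta=2.0, reps=2000)
print(est_null, est_sep)
```

With $\delta = 0$ the true error of any rule is one half, so an average clearly below one half illustrates the optimism of the C-method.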


3.2  The  U-method: 

In this case, we first obtain the linear classifier using (N-1) of the observations and then test the classifier on the remaining observation. The process is repeated N times. An estimate of $p_{21}$ will be given by

(3.6)  $e_2 = \sum_{i=1}^{m} T_i^*/m$

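where

(3.7)  $T_i^* = 1$ if $X_i$ is misclassified to class $C_2$, and $T_i^* = 0$ otherwise,

where classifier (2.1) is used after replacing $\bar X$ by $\bar X_{(i)}$ and X equals $X_i$; here $\bar X_{(i)} = (m-1)^{-1}[m\bar X - X_i]$ is the mean of the first sample with $X_i$ left out. Once again, due to the symmetry of the problem,

$E(e_2) = E(T_1^*) = P[(X_1 - \tfrac12(\bar X_{(1)} + \bar Y))'(\bar X_{(1)} - \bar Y) < 0]$.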
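The leave-one-out mean $\bar X_{(i)} = (m-1)^{-1}[m\bar X - X_i]$ means that the U-method does not need to recompute a sample mean from scratch for each left-out observation. A small Python sketch with made-up data (the name `loo_mean` is an assumption for illustration):

```python
def loo_mean(xbar, x_i, m):
    """Mean of the sample with x_i removed, from the full mean xbar."""
    return [(m * b - x) / (m - 1) for b, x in zip(xbar, x_i)]

xs = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]   # hypothetical class-1 sample
m = len(xs)
xbar = [sum(col) / m for col in zip(*xs)]
direct = [sum(col) / (m - 1) for col in zip(*xs[1:])]   # recomputed without xs[0]
shortcut = loo_mean(xbar, xs[0], m)
print(direct, shortcut)  # [4.0, 2.0] [4.0, 2.0]
```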


However, the above is the same probability as the one obtained by John (1961). The similarity follows as soon as we recognize the fact that $X_1$, $\bar X_{(1)}$ and $\bar Y$ are all independently distributed. Thus, from equation (77) of John (1961) we get

(3.8)  $E(e_2)$


-(X.+X_)  » r xfx! 

e 1 2 l l [ 


r=0  s=0 


rls! 


- I, 


r(l-P) 


(ip+r,  jp+s)} 


oo  oo 


xfx8 


♦ l l ft;!1!  (iP+sjP+r)], 

__n  rlS!  ^(l+p)  * 


r=0  8=r+l 


where 


6 (m-l)n  , 1_ 

4 ( 1+P)  1 


Tl 


(m+n-l)z  lm+n-1+4 (m-l)n}' 


X„  = 


6 (m-l)n  , 1 

4(i-PT  [ r + 


o J 

(m+n-lj  lm+n-1+4 (m-1) n} 


m-n-1 


(m+n-1) {m+n-1+4 (m-1) n} 


and $\delta^2$ is, as defined above, the Mahalanobis distance.

4. EXPECTED VALUE OF THE ESTIMATE OF THE PROBABILITY OF ERROR, $\Sigma$ UNKNOWN:

In this section the results of the previous section are extended to the case when $\Sigma$ is unknown. In Section 4.1 the C-method is considered and in Section 4.2 the U-method. In Section 4.2 it is shown that no new result is needed, since the results of Okamoto (1963) apply. The results of Section 4.1 are new.

4.1 The C-method:

Since $\Sigma$ is assumed to be unknown, it must be estimated, and S defined in (2.4) provides an estimate. An estimate of $p_{21}$ is clearly given by

(4.1)  $e_3 = m^{-1}\sum_{i=1}^{m} V_i$

where

(4.2)  $V_i = 1$ if $\{X_i - \tfrac12(\bar X+\bar Y)\}'S^{-1}(\bar X-\bar Y) < 0$, and $V_i = 0$ otherwise,

$i = 1, 2, \ldots, m$. In this section we obtain an expression for the expected value of $e_3$. The expression is then evaluated by means of numerical integration.

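A conditional argument is employed to simplify the expected value of $e_3$. Since $X_1, \ldots, X_m$ are independent and identically distributed,

(4.3)  $E(e_3) = m^{-1}\sum_{i=1}^{m}E(V_i) = E(V_1)$
       $= P[\{X_1 - \tfrac12(\bar X+\bar Y)\}'S^{-1}(\bar X-\bar Y) < 0]$
       $= 1 - P[X_1'S^{-1}(\bar X-\bar Y) > \tfrac12(\bar X+\bar Y)'S^{-1}(\bar X-\bar Y)]$
       $= 1 - E_{\bar X,\bar Y,S}[P\{X_1'S^{-1}(\bar X-\bar Y) > \tfrac12(\bar X+\bar Y)'S^{-1}(\bar X-\bar Y) \mid \bar X,\bar Y,S\}]$.

First we consider the inner conditional probability in (4.3), and then we obtain its expected value with respect to $\bar X$, $\bar Y$, S. The results are stated in the following theorem.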


Theorem 4.1. Let $U_1 = X_1'S^{-1}(\bar X-\bar Y)$.

(i) The conditional density of $U_1$ given $\bar X$, $\bar Y$, S is

(4.4)  $f_4(u_1 \mid \bar X,\bar Y,S) = d\,\Big\{\frac{m-1}{m}B_{11}\Big\}^{-1/2}\Big\{1 - \frac{m}{m-1}\,\frac{(u_1-\bar u_1)^2}{B_{11}}\Big\}^{(m+n-5)/2}$  for  $\frac{m}{m-1}\,\frac{(u_1-\bar u_1)^2}{B_{11}} \le 1$,

and 0 otherwise, where d, $\bar u_1$ and $B_{11}$ are given by (4.12) below.

(ii)

(4.5)  $E(V_1) = \tfrac12 - \tfrac12 P[W > k^{-2}] - \sum_{j=0}^{\infty} C(j)\int_0^{k^{-2}}\int_0^{k\sqrt w}(1-t^2)^{(m+n-5)/2}\,w^{j+p/2-1}(1+w)^{-(j+(m+n-1)/2)}\,dt\,dw$

where k is given by (4.16), C(j) by (4.18) and $W = \chi_1^2/\chi_2^2$, where $\chi_1^2$ is a noncentral chi-square random variable with p degrees of freedom and noncentrality parameter

$\lambda = \frac{mn}{m+n}(\mu_1-\mu_2)'\Sigma^{-1}(\mu_1-\mu_2)$

and $\chi_2^2$ is a chi-square random variable with (m+n-p-1) degrees of freedom.

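Proof: (i) Set $A = (m+n-2)S$, $\mathring X = (m-1)^{-1}\sum_{i=2}^{m}X_i$ and

$\mathring A = \sum_{i=2}^{m}(X_i-\mathring X)(X_i-\mathring X)' + \sum_{j=1}^{n}(Y_j-\bar Y)(Y_j-\bar Y)'$.

Then $X_1$, $\mathring X$ and $\mathring A$ are statistically independently distributed; the density of $X_1$ is $\phi(\mu_1,\Sigma)$, that of $\mathring X$ is $\phi(\mu_1,(m-1)^{-1}\Sigma)$, and $\mathring A$ is Wishart $(m+n-3,\Sigma)$. Consequently, the joint density of $X_1$, $\mathring X$ and $\mathring A$, denoted by $f_1$, is

(4.6)  $f_1(x_1,\mathring x,\mathring A) = c_1\{\exp -\tfrac12(x_1-\mu_1)'\Sigma^{-1}(x_1-\mu_1)\}\{\exp -\tfrac{m-1}{2}(\mathring x-\mu_1)'\Sigma^{-1}(\mathring x-\mu_1)\}\{|\mathring A|^{(m+n-4-p)/2}\exp -\tfrac12\,\mathrm{trace}\,\Sigma^{-1}\mathring A\}$

where the constant $c_1$ is given by

(4.7)  $c_1 = \{(2\pi)^{-p/2}|\Sigma|^{-1/2}\}\{(2\pi)^{-p/2}|(m-1)^{-1}\Sigma|^{-1/2}\}\{2^{(m+n-3)p/2}\,\pi^{p(p-1)/4}\,|\Sigma|^{(m+n-3)/2}\prod_{i=1}^{p}\Gamma(\tfrac{m+n-2-i}{2})\}^{-1}$.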

Since $\bar X$, $\mathring X$, A and $\mathring A$ are related by the equalities

$\mathring X = (m-1)^{-1}(m\bar X - X_1)$,
$\mathring A = A - m(m-1)^{-1}(X_1-\bar X)(X_1-\bar X)'$,

the joint density of $X_1$, $\bar X$ and A can be easily obtained from (4.6) by standard procedures. Moreover, we also know that the random variables $\bar X$ and A are statistically independent, that $\bar X$ follows a gaussian density and that the random matrix A is distributed as Wishart $(m+n-2,\Sigma)$. Thus, the conditional density of $X_1$ given $\bar X$ and A is obtained by taking the ratio of the joint density of $X_1$, $\bar X$ and A to that of $\bar X$, A. This conditional density simplifies to:

(4.8)  $f_2(x_1 \mid \bar x, A) = c_2|A|^{-1/2}\{1-Q_1(x_1)\}^{(m+n-p-4)/2}$ for $Q_1(x_1) < 1$, and 0 otherwise,

where

(4.9)  $c_2 = \pi^{-p/2}\{m(m-1)^{-1}\}^{p/2}\,\Gamma(\tfrac{m+n-2}{2})\,\Gamma^{-1}(\tfrac{m+n-p-2}{2})$,

$Q_1(x_1) = m(m-1)^{-1}(x_1-\bar x)'A^{-1}(x_1-\bar x)$.
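The relation between the scatter matrix with and without $X_1$, which underlies the step from (4.6) to (4.8), can be checked numerically. The sketch below assumes the scalar case p = 1 and made-up data, and looks only at the class-1 part of the scatter, since the class-2 part is common to both matrices.

```python
def sum_sq_dev(values):
    """Sum of squared deviations from the sample mean (scalar scatter)."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values)

xs = [1.0, 4.0, 2.0, 7.0, 6.0]   # hypothetical class-1 sample, p = 1
m = len(xs)
xbar = sum(xs) / m

full = sum_sq_dev(xs)            # class-1 part of A
reduced = sum_sq_dev(xs[1:])     # class-1 part of the scatter with X_1 removed
downdated = full - m / (m - 1) * (xs[0] - xbar) ** 2

print(reduced, downdated)        # both equal 14.75 up to rounding
```

The same rank-one correction applies componentwise to the p-variate scatter matrix.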


In order to obtain the conditional density of $U_1$ given $\bar X$, $\bar Y$ and S, we first take any nonsingular square matrix T whose first row is given by $(\bar X-\bar Y)'S^{-1}$, and make the transformation

$U = T X_1$.

Also denote $TAT'$ by B and $T\bar X$ by $\bar U$. Then, clearly, $U_1$ is the first element of the vector random variable U. From (4.8) the conditional density of U is easily obtained and is given by

(4.10)  $f_3(u \mid \bar x,\bar y,A) = c_2|B|^{-1/2}\{1-Q_2(u)\}^{(m+n-p-4)/2}$ for $Q_2(u) < 1$, and 0 otherwise,

where $c_2$ is given by (4.9) and

$Q_2(u) = m(m-1)^{-1}(u-\bar u)'B^{-1}(u-\bar u)$.

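In order to obtain the conditional density of $U_1$ it remains to integrate out the last p-1 components of U. To perform this integration, we first partition

(4.11)  $U = \begin{pmatrix}U_1\\ U_*\end{pmatrix}$,  $\bar U = \begin{pmatrix}\bar U_1\\ \bar U_*\end{pmatrix}$,  $B = \begin{pmatrix}B_{11} & B_{12}\\ B_{21} & B_{22}\end{pmatrix}$.

Then

$Q_2(u) = \frac{m}{m-1}\Big[\frac{(u_1-\bar u_1)^2}{B_{11}} + \{(u_*-\bar u_*) - B_{21}B_{11}^{-1}(u_1-\bar u_1)\}'(B_{22}-B_{21}B_{11}^{-1}B_{12})^{-1}\{(u_*-\bar u_*) - B_{21}B_{11}^{-1}(u_1-\bar u_1)\}\Big]$
        $= Q_3(u_1) + Q_4(u_1,u_*)$  (say),

$|B| = B_{11}\,|B_{22}-B_{21}B_{11}^{-1}B_{12}|$

and $Q_2(u) < 1$ if and only if $Q_4(u_1,u_*) < 1 - Q_3(u_1)$.

Thus, the integration of $f_3$ with respect to the vector $u_*$ can be performed easily, using the identity (for a p-dimensional vector t)

$\int_{\{t:\,t't<1\}}(1-t't)^{(M-p)/2-1}\,dt = \pi^{p/2}\,\Gamma(\tfrac{M-p}{2})\,\Gamma^{-1}(\tfrac{M}{2})$

and the linear transformation

$t = [\{1-Q_3(u_1)\}(m-1)m^{-1}(B_{22}-B_{21}B_{11}^{-1}B_{12})]^{-1/2}\{(u_*-\bar u_*) - B_{21}B_{11}^{-1}(u_1-\bar u_1)\}$.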

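Therefore, the conditional density of $U_1$ given $\bar X$, $\bar Y$ and S is given by

$f_4(u_1 \mid \bar X,\bar Y,S) = d\,\Big\{\frac{m-1}{m}B_{11}\Big\}^{-1/2}\Big\{1 - \frac{m}{m-1}\,\frac{(u_1-\bar u_1)^2}{B_{11}}\Big\}^{(m+n-5)/2}$  for  $\frac{m}{m-1}\,\frac{(u_1-\bar u_1)^2}{B_{11}} \le 1$,

and 0 otherwise, where

(4.12)  $d = \pi^{-1/2}\,\Gamma(\tfrac{m+n-2}{2})\,\Gamma^{-1}(\tfrac{m+n-3}{2})$,
        $B_{11}$ = (1,1)th element of B $= (m+n-2)(\bar X-\bar Y)'S^{-1}(\bar X-\bar Y)$,
        $\bar u_1 = \bar X'S^{-1}(\bar X-\bar Y)$.

This completes the proof of (i).

(ii) By equation (4.3),

(4.13)  $E(V_1) = 1 - E_{\bar X,\bar Y,S}\Big[P\Big\{U_1-\bar u_1 > -\frac{B_{11}}{2(m+n-2)} \,\Big|\, \bar X,\bar Y,S\Big\}\Big]$
        $= 1 - E_{B_{11}}\Big[P\Big\{\Big(\frac{m}{(m-1)B_{11}}\Big)^{1/2}(U_1-\bar u_1) > -\frac{1}{2(m+n-2)}\Big(\frac{m\,B_{11}}{m-1}\Big)^{1/2}\Big\}\Big]$.

Now $B_{11}/(m+n-2)^2 = \{(m+n)/(mn)\}W$, with W as in the statement of the theorem, so that the threshold on the right-hand side equals $-k\sqrt W$, where

(4.16)  $k = [(m+n)\{4n(m-1)\}^{-1}]^{1/2}$.

Therefore,

$E(V_1) = 1 - P[W>k^{-2}] - \tfrac12 P[W<k^{-2}] - \sum_{j=0}^{\infty} c(j)\int_0^{k^{-2}}\int_0^{k\sqrt w}(1-t^2)^{(m+n-5)/2}\,w^{j+p/2-1}(1+w)^{-(j+(m+n-1)/2)}\,dt\,dw$

(4.17)  $= \tfrac12 - \tfrac12 P[W>k^{-2}] - \sum_{j=0}^{\infty} c(j)\int_0^{k^{-2}}\int_0^{k\sqrt w}(1-t^2)^{(m+n-5)/2}\,w^{j+p/2-1}(1+w)^{-(j+(m+n-1)/2)}\,dt\,dw$

where the constant c(j) depends only on j, m, n, p and $\lambda$ and is given by

(4.18)  $c(j) = d\,\{e^{-\lambda/2}(\lambda/2)^j/j!\}\,\Gamma(\tfrac{m+n-1}{2}+j)\,\{\Gamma(\tfrac{m+n-p-1}{2})\,\Gamma(j+\tfrac p2)\}^{-1}$.

Thus, the theorem is proved.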

4.1.1 Numerical Evaluation of $E(e_3)$

In its present form, Theorem 4.1 (ii) is not convenient to evaluate. In this sub-section we obtain further simplifications. We consider the simple case $\delta = 0$ in detail. For $\delta = 0$, all the terms of the infinite series of integrals are zero except the first. For $\delta \ne 0$ there are infinitely many terms; however, because of the coefficients c(j), the series converges quickly. Moreover, each term of the infinite series of integrals can be evaluated in exactly the same manner as the term for $\delta = 0$. Thus, for $\delta \ne 0$ details are omitted and only the numerical values are given.

For $\delta = 0$, (4.17) gives

(4.19)  $E(V_1) = \tfrac12 - \tfrac12 P(W>k^{-2}) - c(0)\int_0^{k^{-2}}\int_0^{k\sqrt w}(1-t^2)^{(m+n-5)/2}\,w^{p/2-1}(1+w)^{-(m+n-1)/2}\,dt\,dw$.

Since $P(W>k^{-2})$ can be obtained from the incomplete beta tables [reference [13]], we consider the integral involved in the third term. We reduce this double integral to a sum of single integrals, and the latter are evaluated by numerical integration. At this stage we also make the additional simplifying assumption m = n. As is clear from the following development, the general case $m \ne n$ can be handled in exactly the same manner.

For m = n, the integral in the third term on the right hand side of equation (4.19) is given by

(4.20)  $c(0)\int_0^{k^{-2}}\int_0^{k\sqrt w}(1-t^2)^{(2n-5)/2}\,w^{p/2-1}(1+w)^{-(2n-1)/2}\,dt\,dw$.

We make the transformations $t^2 = u$ and $w/(1+w) = y$, so that (4.20) is equal to

$\frac{c(0)}{2}\int_0^{(1+k^2)^{-1}}\int_0^{k^2y/(1-y)} y^{p/2-1}(1-y)^{(2n-p-3)/2}(1-u)^{(2n-5)/2}u^{-1/2}\,du\,dy$.

By changing the order of integration the above expression becomes

$\frac{c(0)}{2}\int_0^{1}\int_{u(u+k^2)^{-1}}^{(1+k^2)^{-1}} y^{p/2-1}(1-y)^{(2n-p-3)/2}(1-u)^{(2n-5)/2}u^{-1/2}\,dy\,du$.

If p is odd, then (2n-p-3) is even and integration by parts can be performed to evaluate the above inner integral in a recursive manner. In the alternative case, i.e. when p is even, p/2-1 is an integer and therefore once again we can integrate by parts. Thus, (4.20) will be equal to

(4.21)


a! 


c ( 0)  r 

2 2 a+b+1 
1_u  (a-i)  ! { (b+1 ) . . . (b+l+i)  } (l+)t  ) 


2n-5 

ua-i-l/2  (1-u) 2_  (l+k2-u) 


b+l-i 


du 


k2(b+i+i)  , 

~ ->  j 


when  p is  even;  a = ^ - 1,  b = 2n-~P  - 


c (0) 
~T~ 


b 

I 

i=0 


b! 


a+b+1 


(b-i) ! (a+1) . . . (a+l+i) (1+k  ) 
k 


l 


1 _ . i , , . . 1 2n-5 

~ ,,  ..  ro,l.  T_1/2n-4  1 a+l+i— =■  — « — 

2ib-n  r<2>  r<-2 — > . / u 2(i-u>  2 


2 b“1 

(1+k  -u)  du} 


when  p is  odd, 


a = - 1,  b 


2n-p- 3 
2 


We evaluate the integrals involved in (4.21) by numerical integration. Some numerical values are presented below in Tables 4.1, 4.2 and 4.3.

Table 4.1

        (?)     2       3       4       5
  3     1.00    .711    .663    .638    .629
  5     .881    .713    .667    .642    .625
  7     .852    .715    .668    .640    .624

Table 4.2

        (?)     2       3       4       5
  3     1.0     .739    .687    .653    .640
  5     .908    .731    .681    .654    .637
  7     .868    .728    .679    .652    .636

Table 4.3

        (?)     2       3       4       5
  3     1.0     .763    .709    .678    .656
  5     .934    .747    .695    .666    .653
  7     .977    .739    .689    .666    .652


4.2 The U-method:

Another estimator of $p_{21}$ is obtained by the 'leave-one-out method'. Suppose the unknown parameters $\mu_1$, $\mu_2$ and $\Sigma$ are estimated by $\mathring X$, $\bar Y$ and $\mathring S = (m+n-3)^{-1}\mathring A$ (defined in the proof of Theorem 4.1), and the left-out observation $X_1$, known to belong to $C_1$, is used for testing. Define

(4.22)  $V_1^* = 1$ if $X_1$ is classified as a member of $C_2$, and $V_1^* = 0$ otherwise,

when the linear classifier (2.3) is employed. This process is repeated successively by leaving out $X_2, \ldots, X_m$ and the corresponding $V_2^*, \ldots, V_m^*$ are obtained. Then an estimate of $p_{21}$ given by the U-method is

(4.23)  $e_4 = \frac1m\sum_{i=1}^{m}V_i^*$.

Since $X_1, X_2, \ldots, X_m$ are independent and identically distributed, the $V_i^*$'s will also be identically distributed, although not independently. Thus,

(4.24)  $E(e_4) = E(V_1^*) = P[\{X_1 - \tfrac12(\mathring X+\bar Y)\}'\mathring S^{-1}(\mathring X-\bar Y) < 0]$.

Lachenbruch and Mickey (1968) have obtained $E(V_1^*)$ by the Monte Carlo method. An approximate value of this quantity can also be obtained by the following expression of Okamoto (1963).

where the a's and b's are certain constants depending upon $\delta^2$ and p, but independent of m and n, $\delta^2$ is the Mahalanobis distance and $\Phi(\cdot)$ represents the distribution function of the standard normal random variable. In Table 4.4 below some numerical values are given for selected values of m, n and $\delta^2$. The values given in this table agree with those obtained by Lachenbruch and Mickey (1968).

The expected values $E(e_3)$ and $E(e_4)$ obtained by the above theoretical considerations agree with the well known results obtained from empirical studies that $e_4$ is less biased than $e_3$ as an estimator of $p_{21}$. Thus the U-method provides a better estimator of $p_{21}$ than the C-method [when the criterion of comparison is bias]. From the point of view of calculations the estimator $e_3$, given by the C-method, is easier to evaluate than $e_4$, although convenience of evaluation is not a meaningful criterion in view of the availability of modern computers.

Table 4.4  ($\delta = 1$)

  p \ N/p   (?)       2         3         4         5
  3         0.6526    0.6141    0.6376    0.6507    0.6588
  5         0.6505    0.6242    0.6399    0.6506    0.6578
  7         0.6713    0.6302    0.6417    0.6510    0.6577

5. VARIANCE OF THE ESTIMATOR OF PROBABILITY OF ERROR BY THE C-METHOD: $\Sigma$ IS KNOWN.

The estimator of the probability of error on the design set was defined by $e_1$ in Section 3, and a method of evaluating its expectation was described there. Although the expectation of $e_1$ provides information regarding its bias, knowledge of the variance of this estimator will further increase our understanding of its behavior. Very little attention has been paid to the variances of estimators of the probability of misclassification.

In this section we consider the variance of $e_1$ when $\Sigma$ is assumed known. Foley (1972), who considered $E(e_1)$ for this situation, also obtained an approximate expression for the variance of $e_1$. Recall that $m e_1 = \sum_{i=1}^{m}T_i$ and that the marginal distributions of the $T_i$'s are identical. Foley made the assumption that the $T_i$'s are approximately independently distributed, so that $m e_1$ is a binomial random variable and the variance of $e_1$ equals $E(e_1)\{1-E(e_1)\}m^{-1}$. This approximate variance has the upper bound $(4m)^{-1}$. We evaluate the exact variance of $e_1$.

By the definition of $e_1$ and the symmetry in the distributions of the $T_i$'s,

(5.1)  $\mathrm{Var}(e_1) = \frac{1}{m^2}\Big\{\sum_{i}\mathrm{Var}(T_i) + \sum\sum_{i\ne j}\mathrm{cov}(T_i,T_j)\Big\}$
       $= \frac{1}{m^2}\{m\,\mathrm{Var}(T_1) + m(m-1)\,\mathrm{cov}(T_1,T_2)\}$
       $= \frac1m\,[\{E T_1 - E^2T_1\} + (m-1)\{E(T_1T_2) - E^2T_1\}]$.

$E(T_1)$ has already been obtained in Section 3. Thus, to obtain the exact variance of $e_1$, it remains to find $E(T_1T_2)$, which is obtained in the following subsection.
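Expression (5.1) and the binomial approximation attributed to Foley can be compared with a few lines of Python. This is a sketch only: the numerical inputs below are illustrative assumptions, not values from the paper, and the function names are hypothetical.

```python
def binomial_variance(e_t1, m):
    """Approximation that treats the T_i as independent."""
    return e_t1 * (1.0 - e_t1) / m

def exact_variance(e_t1, e_t1t2, m):
    """Expression (5.1) in terms of E(T_1) and E(T_1 T_2)."""
    var_t = e_t1 - e_t1 ** 2      # T_1 is an indicator, so E(T_1^2) = E(T_1)
    cov_t = e_t1t2 - e_t1 ** 2
    return (var_t + (m - 1) * cov_t) / m

m = 10
e_t1 = 0.3                                    # illustrative E(T_1)
independent = exact_variance(e_t1, e_t1 ** 2, m)
correlated = exact_variance(e_t1, 0.12, m)    # illustrative E(T_1 T_2) > E(T_1)^2
print(independent, correlated, binomial_variance(0.5, m))
```

When $E(T_1T_2) = E^2(T_1)$ the two formulas coincide, so any difference between the exact and the approximate variance comes entirely from the covariance term.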

5.1. Expression of $E(T_1T_2)$

We have already made the assumption that $\Sigma$ is known. Thus, without loss of generality, we will assume that $\Sigma = I$, the identity matrix. Because $\Sigma = I$, the linear classifier (2.1) also simplifies, and an observation X is classified to $C_1$ ($C_2$) if

(5.2)  $\{X - \tfrac12(\bar X+\bar Y)\}'(\bar X-\bar Y) > 0\ (<0)$.

In this section we make the further simplifying assumption m = n. It can be seen from the following development that the results for $m \ne n$ are obtained in exactly the same manner.

Following the notation of earlier sections, where the $X_i$'s denote observations belonging to $C_1$ and the $Y_j$'s belong to $C_2$, and using the linear classifier (5.2), we obtain

$E(T_1T_2) = P[X_1$ and $X_2$ are both classified as members of $C_2]$
           $= P[\{X_i - \tfrac12(\bar X+\bar Y)\}'(\bar X-\bar Y) < 0,\ i = 1, 2]$.

By a conditional argument, the above probability can also be written as

(5.3)  $E_{\bar X,\bar Y}\{P[\{X_i - \tfrac12(\bar X+\bar Y)\}'(\bar X-\bar Y) < 0,\ i = 1,2 \mid \bar X,\bar Y]\}$.

First we evaluate the inner term of (5.3), i.e. the conditional probability given $\bar X$, $\bar Y$. The following lemma proves useful in this evaluation. Let

(5.4)  $U_i = \{X_i - \tfrac12(\bar X+\bar Y)\}'(\bar X-\bar Y)$,  i = 1, 2.

Lemma 5.1. The joint distribution of $(U_1,U_2)$ given $\bar X = x$ and $\bar Y = y$ is bivariate normal with mean vector $\tfrac12(\alpha,\alpha)'$ and covariance matrix

(5.5)  $\frac{\alpha}{n}\begin{pmatrix}n-1 & -1\\ -1 & n-1\end{pmatrix}$,  where $\alpha = \alpha(x,y) = (x-y)'(x-y)$.
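The covariance structure in Lemma 5.1 can be checked without simulation. For gaussian samples the residuals $X_i - \bar X$ are independent of $\bar X$, so the conditional covariance of $(X_1, X_2)$ given $\bar X$ equals the covariance of the residuals $(X_1 - \bar X, X_2 - \bar X)$. The sketch below assumes unit-variance scalar coordinates and a hypothetical function name; it computes that covariance from the rows of the centering matrix.

```python
def residual_covariance(n):
    """Covariance of (X_1 - Xbar, X_2 - Xbar) for n i.i.d. unit-variance scalars."""
    # Row i of the centering matrix I - (1/n) 11' expresses X_i - Xbar
    # as a linear combination of X_1, ..., X_n.
    def row(i):
        return [(1.0 if j == i else 0.0) - 1.0 / n for j in range(n)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    r1, r2 = row(0), row(1)
    return [[dot(r1, r1), dot(r1, r2)], [dot(r2, r1), dot(r2, r2)]]

n = 6
cov = residual_covariance(n)
expected = [[(n - 1) / n, -1 / n], [-1 / n, (n - 1) / n]]
print(cov)
```

Projecting onto the direction $x - y$ multiplies this matrix by $\alpha$, which gives the form stated in (5.5).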


Proof: Let $\hat X$ denote the mean of the (n-2) observations $X_3, \ldots, X_n$. Then $X_1$, $X_2$ and $\hat X$ are statistically independent, normally distributed with the same mean $\mu_1$ and covariance matrices I, I and $(n-2)^{-1}I$ respectively. To obtain the joint distribution of $X_1$, $X_2$ and $\bar X$ from the joint distribution of $X_1$, $X_2$, $\hat X$ we apply the transformation

$\bar X = n^{-1}(X_1+X_2+(n-2)\hat X)$.

The joint density of $X_1$, $X_2$, $\bar X$ is given by

$n^{p}(n-2)^{-p/2}(2\pi)^{-3p/2}\exp -\tfrac12\big[(x_1-\mu_1)'(x_1-\mu_1) + (x_2-\mu_1)'(x_2-\mu_1) + (n-2)\{(n-2)^{-1}(nx-x_1-x_2)-\mu_1\}'\{(n-2)^{-1}(nx-x_1-x_2)-\mu_1\}\big]$.

Since $\bar X$ is a p-variate gaussian random variable, the conditional density of $X_1$, $X_2$ given $\bar X = x$ is obtained by dividing the above expression by the pdf of $\bar X$. This conditional pdf of $X_1$, $X_2$ simplifies to

$f(x_1,x_2 \mid x) = (2\pi)^{-p}\Big(\frac{n}{n-2}\Big)^{p/2}\exp -\frac{1}{2(n-2)}\big[(n-1)\{x_1'x_1+x_2'x_2\} + 2x_1'x_2 + 2n\,x'x - 2n\,x'(x_1+x_2)\big]$
                   $= (2\pi)^{-p}|\Lambda|^{-1/2}\exp -\tfrac12\begin{pmatrix}x_1-x\\ x_2-x\end{pmatrix}'\Lambda^{-1}\begin{pmatrix}x_1-x\\ x_2-x\end{pmatrix}$

where

$\Lambda = \frac1n\begin{pmatrix}(n-1)I & -I\\ -I & (n-1)I\end{pmatrix}$.

Thus the conditional distribution of $(X_1', X_2')'$ given $\bar X = x$ is 2p-variate gaussian with mean vector $(x', x')'$ and covariance matrix $\Lambda$. Next, if $Z_i = X_i'(\bar X-\bar Y)$, i = 1, 2, then the conditional distribution of $(Z_1,Z_2)'$ given $\bar X = x$, $\bar Y = y$ is bivariate normal with

mean $\begin{pmatrix}x'(x-y)\\ x'(x-y)\end{pmatrix}$  and covariance  $\frac{\alpha}{n}\begin{pmatrix}n-1 & -1\\ -1 & n-1\end{pmatrix}$.

Finally, if

$U_i = \Big(X_i - \frac{\bar X+\bar Y}{2}\Big)'(\bar X-\bar Y)$,  i = 1, 2,

then the joint distribution of $(U_1,U_2)$ given $\bar X = x$, $\bar Y = y$ is also bivariate normal with mean vector $\tfrac12(\alpha,\alpha)'$ and with the same covariance matrix as that of the Z's. This completes the proof of the lemma.

From (5.3), $E(T_1T_2) = E\{P(U_1<0,U_2<0 \mid \bar X,\bar Y)\}$ where the $U_i$'s are defined by (5.4), and using Lemma 5.1 we get

$P(U_1<0,U_2<0 \mid \bar X,\bar Y) = \int_{-\infty}^{0}\int_{-\infty}^{0} k(\alpha)\exp -\frac{1}{2(n-2)\alpha}\big[(n-1)\{(u_1-\tfrac{\alpha}{2})^2 + (u_2-\tfrac{\alpha}{2})^2\} + 2(u_1-\tfrac{\alpha}{2})(u_2-\tfrac{\alpha}{2})\big]\,du_1\,du_2$

In the second expression above we use the symmetry of the integrand to change the integral from the third quadrant to the first quadrant. To get the third equality above, we first use the symmetry of the integrand around the line $t_1 = t_2$ and then change to the polar coordinate system.

Thus, to evaluate $E(T_1T_2)$ it remains to take the expectation of (5.6) with respect to $\bar X$, $\bar Y$. Since (5.6) depends on $\bar X$, $\bar Y$ only through $\alpha$, we take the expectation with respect to $\alpha$. But $(n/2)\alpha$ is a noncentral chi-square random variable with p degrees of freedom and with noncentrality parameter $\tfrac n2\delta^2$, where $\delta^2 = (\mu_1-\mu_2)'(\mu_1-\mu_2)$. Therefore,

E<viV  ' 1 1 <J0  rr  ‘r*2’8  2 <"*2>  r 1 <«<-?> 


. na.  ,no.  s+5-1  /n (n-2)  ,^r  1 

(exp — j)  (-^)  } i — / [exp-^ 

0 

d0  i ^ 

(n-l+sln  m*  da‘ 


(n-l+sin  2 ) . 

-—j  a] 

(n-2)sm  0 


Due  to  convergence  of  the  above  integral,  the  order  of  integration 
can  be  interchanged.  Interchanging  the  order  of  integration  and 
then  integrating  over  a produces 


/T  { l exp(-JS£)|T(JS2)S2  VaT2'  r'1(s+|)/^J-2) 


E(V1V2) 


-(s+l>  _x 


0 s=0 


-(s+|) 

r>/  ^p»ri.  l,n-l+8in  20  , . d0 

r,S^  IU^(,„-2  ,i„‘  8 ' <■-»«!»  " 


(5.7) 


= / exp-^{ 


n-l+sin  20 


5 

2n(n-2)sin  0+(n-l)+sin  20 


-> 


fl  . 1 n-l+sin  20,"2  d0 

1 2n  " ox  . 2q  i (n-1 ) +sin  20  - 

(n-2)sm  0 


Expression  (5.7)  involves  integration  over  one  variable  and 
to  evaluate  it  we  use  numerical  integration.  The  graph  of  var(c^) 
is  also  presented  for  various  values  of  N/p. 

5.2.  Some  Numerical  Values  of  Var(e) 

As shown earlier [see expression (5.1)], to evaluate var($e_1$)
we need to know $E(T_1)$ and $E(T_1 T_2)$. To evaluate $E T_1$ we use
equation (i. 5), and $E(T_1 T_2)$ can be obtained by numerical integration
of (5.7). Thus var($e_1$) can be calculated exactly. In Table 5.1
we present these values for some choices of $n$ and $p$. The values
in Table 5.1 show an interesting feature, namely that for a fixed
value of $n$, var($e_1$) decreases as $p$ increases. Thus, for
fixed $n$, although the bias in $e_1$ increases, the variance of $e_1$
decreases as $p$ increases.

A computer program which evaluates var($e_1$) is given in
the Appendix.
[Figure: variance plot for δ = 1; horizontal axis n/p.]


Table 5.1.  var($e_1$), δ = 0

          N
   p      5       10      15      20      25      30      35      40

   3   .02436  .01343  .00931  .00713  .00577  .00485  .00418  .00367
   5   .02158  .01243  .00876  .00678  .00553  .00467  .00405  .00357
  10   .01625  .01069  .00781  .00616  .00509  .00434  .00379  .00336
6. VARIANCE OF THE ESTIMATE OF PROBABILITY OF ERROR BASED ON
THE U-METHOD.

In this section an approximate value of the variance of the
estimate of the probability of error of misclassification is
obtained when $\Sigma$ is assumed known. This estimator, $e_2$, was defined
in (3.6).

(6.1)  $\mathrm{Var}(e_2) = \frac{1}{m}\left[\{E T_1^* - E^2(T_1^*)\} + (m-1)\{E(T_1^* T_2^*) - E^2(T_1^*)\}\right]$.

$E(T_1^*)$ has already been evaluated [see equation (3.8)]. Thus, once
again, to calculate var($e_2$) it remains to evaluate $E(T_1^* T_2^*)$.

6.1.  Expression for $E(T_1^* T_2^*)$

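By definition

(6.2)  $T_1^* = \begin{cases} 1 & \text{if } \{X_1 - \tfrac12(\bar X_{(1)} + \bar Y)\}'(\bar X_{(1)} - \bar Y) < 0 \\ 0 & \text{otherwise} \end{cases}$

and a similar expression holds for $T_2^*$. Here

$\bar X_{(i)} = (m-1)^{-1}[m\bar X - X_i] = (m-1)^{-1} \sum_{j \ne i} X_j$.

Then

$E(T_1^* T_2^*) = P[\{X_i - \tfrac12(\bar X_{(i)} + \bar Y)\}'(\bar X_{(i)} - \bar Y) < 0;\ i = 1,2]$

(6.3)  $= P[\{X_i - \tfrac12[(m-1)^{-1}(m\bar X - X_i) + \bar Y]\}'\{(m-1)^{-1}(m\bar X - X_i) - \bar Y\} < 0,\ i = 1,2]$.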
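
The leave-one-out construction behind (6.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: observations are plain lists of floats, the covariance matrix is taken as the identity (so the discriminant is a simple inner product), and the helper names `mean`, `loo_mean` and `t_star` are hypothetical.

```python
def mean(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    m = len(vectors)
    return [sum(v[k] for v in vectors) / m for k in range(len(vectors[0]))]

def loo_mean(xs, i):
    """Mean of the X-sample with observation i left out, via (m*xbar - x_i)/(m-1)."""
    m = len(xs)
    xbar = mean(xs)
    return [(m * xbar[k] - xs[i][k]) / (m - 1) for k in range(len(xbar))]

def t_star(xs, ys, i):
    """1 if x_i is misclassified by the rule built without x_i, else 0."""
    xb = loo_mean(xs, i)
    yb = mean(ys)
    mid = [0.5 * (a + b) for a, b in zip(xb, yb)]       # midpoint of the two means
    s = sum((x - c) * (a - b) for x, c, a, b in zip(xs[i], mid, xb, yb))
    return 1 if s < 0 else 0
```

Averaging `t_star(xs, ys, i)` over the observations gives a leave-one-out error count for the X-sample; whether this matches the exact normalisation of the estimator in (3.6) is not checked here.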


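By a sequence of arguments presented below, the events of
interest in (6.3) can be written in a convenient form. Note
that for $i = 1,2$,

$\{X_i - \tfrac12[(m-1)^{-1}(m\bar X - X_i) + \bar Y]\}'\{(m-1)^{-1}(m\bar X - X_i) - \bar Y\} < 0$

iff  $\left[\frac{2m-1}{2(m-1)} X_i - \frac12\left(\frac{m}{m-1}\bar X + \bar Y\right)\right]'\left[-\frac{1}{m-1} X_i + \left(\frac{m}{m-1}\bar X - \bar Y\right)\right] < 0$

iff  $-\frac{2m-1}{2(m-1)^2} X_i'X_i + X_i'\left\{\left(\frac{m}{m-1}\right)^2 \bar X - \bar Y\right\} - \frac12\left\{\left(\frac{m}{m-1}\right)^2 \bar X'\bar X - \bar Y'\bar Y\right\} < 0$

iff  $X_i'X_i - \frac{2(m-1)^2}{2m-1}\left\{\left(\frac{m}{m-1}\right)^2 \bar X - \bar Y\right\}'X_i + \frac{(m-1)^2}{2m-1}\left\{\left(\frac{m}{m-1}\right)^2 \bar X'\bar X - \bar Y'\bar Y\right\} > 0$

(6.4)  iff  $Z_i'Z_i > \frac{m^2(m-1)^2}{(2m-1)^2}\{\bar X - \bar Y\}'\{\bar X - \bar Y\}$,

where $Z_i = X_i - \frac{(m-1)^2}{2m-1}\left\{\left(\frac{m}{m-1}\right)^2 \bar X - \bar Y\right\}$. Therefore by
(6.3) and (6.4) and using a conditional argument, we obtain

(6.5)  $E(T_1^* T_2^*) = E\left[P\left\{Z_i'Z_i > \left(\frac{m(m-1)}{2m-1}\right)^2 (\bar X - \bar Y)'(\bar X - \bar Y);\ i = 1,2 \mid \bar X, \bar Y\right\}\right]$.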
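
The chain of equivalences leading to (6.4) is purely algebraic, so it can be spot-checked numerically. The Python sketch below compares the original event in (6.3) with the completed-square form for randomly drawn vectors. The function names and the random test design are assumptions made for illustration; the vectors are arbitrary and are not drawn from any particular sampling model.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def event_original(x_i, xbar, ybar, m):
    """Event in (6.3): the leave-one-out discriminant at x_i is negative."""
    loo = [(m * xb - xi) / (m - 1) for xb, xi in zip(xbar, x_i)]
    first = [xi - 0.5 * (l + yb) for xi, l, yb in zip(x_i, loo, ybar)]
    second = [l - yb for l, yb in zip(loo, ybar)]
    return dot(first, second) < 0

def event_z_form(x_i, xbar, ybar, m):
    """Event in (6.4): Z'Z exceeds (m(m-1)/(2m-1))^2 (xbar-ybar)'(xbar-ybar)."""
    c = (m - 1) ** 2 / (2 * m - 1)
    k = (m / (m - 1)) ** 2
    z = [xi - c * (k * xb - yb) for xi, xb, yb in zip(x_i, xbar, ybar)]
    diff = [xb - yb for xb, yb in zip(xbar, ybar)]
    return dot(z, z) > (m * (m - 1) / (2 * m - 1)) ** 2 * dot(diff, diff)

def agree_on_random_draws(trials=2000, seed=1):
    """True if both forms of the event agree on every random draw."""
    rng = random.Random(seed)
    for _ in range(trials):
        m = rng.randint(2, 12)
        p = rng.randint(1, 4)
        x_i = [rng.gauss(0.0, 2.0) for _ in range(p)]
        xbar = [rng.gauss(0.0, 1.0) for _ in range(p)]
        ybar = [rng.gauss(1.0, 1.0) for _ in range(p)]
        if event_original(x_i, xbar, ybar, m) != event_z_form(x_i, xbar, ybar, m):
            return False
    return True
```

A disagreement could in principle arise from rounding when a draw lies almost exactly on the boundary of the event, but for continuous random inputs this is not expected in practice.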


In the proof of Lemma 5.1 we have shown that the joint distribution
of $(X_1', X_2')'$ given $\bar X$ is a 2p-dimensional gaussian distribution with
mean vector $(\bar X', \bar X')'$ and covariance matrix $A$. Therefore, the
conditional distribution, given $\bar X$, $\bar Y$, of $(Z_1', Z_2')'$ is also a
2p-dimensional gaussian with mean vector

$\frac{(m-1)^2}{2m-1}\begin{pmatrix} \bar Y - \bar X \\ \bar Y - \bar X \end{pmatrix}$

and covariance matrix $A$. From the above development, it is
obvious that $Z_1$ and $Z_2$ are dependent random variables and the
marginal distributions of $\frac{m}{m-1} Z_i'Z_i$ are noncentral chisquares, each
with $p$ degrees of freedom and each with noncentrality parameter
$m(m-1)^3(2m-1)^{-2}(\bar X - \bar Y)'(\bar X - \bar Y)$, given $\bar X$ and $\bar Y$.
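
Two of the constants above can be checked directly. Because the $X_i$ average to $\bar X$, the $Z_i$ average exactly to the conditional mean $\frac{(m-1)^2}{2m-1}(\bar Y - \bar X)$, and scaling the squared length of that mean by $m/(m-1)$ gives the stated noncentrality parameter. The Python sketch below illustrates both facts; it assumes the covariance matrix has been reduced to the identity, and the helper names `z_vectors` and `noncentrality` are hypothetical.

```python
def z_vectors(xs, ybar):
    """Return (Z_1..Z_m, xbar) with Z_i = X_i - c*(k*xbar - ybar),
    where c = (m-1)^2/(2m-1) and k = (m/(m-1))^2."""
    m, p = len(xs), len(xs[0])
    xbar = [sum(x[j] for x in xs) / m for j in range(p)]
    c = (m - 1) ** 2 / (2 * m - 1)
    k = (m / (m - 1)) ** 2
    shift = [c * (k * xbar[j] - ybar[j]) for j in range(p)]
    return [[x[j] - shift[j] for j in range(p)] for x in xs], xbar

def noncentrality(m, a):
    """m (m-1)^3 (2m-1)^-2 * a, where a = (xbar - ybar)'(xbar - ybar)."""
    return m * (m - 1) ** 3 / (2 * m - 1) ** 2 * a
```

For any sample, averaging the vectors returned by `z_vectors` reproduces $c(\bar Y - \bar X)$ with $c = (m-1)^2/(2m-1)$, and `noncentrality(m, a)` equals $\frac{m}{m-1} c^2 a$.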

If  we  make  the  approximation  of  treating  Z's  as  independent 
random  variables,  then  to  this  degree  of  approximation,  (which 
is  expected  to  be  small  when  m is  large,  because  covariance 
between  Z's  is  - (m-1)  3IJ, 

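(6.6)  $E(T_1^* T_2^*) = E\left[P^2\left\{Z_1'Z_1 > \left(\frac{m(m-1)}{2m-1}\right)^2 a(\bar X, \bar Y) \mid a(\bar X, \bar Y)\right\}\right]$

where $a(\bar X, \bar Y)$ is defined in (5.5). As seen in the last section,
$\frac{m}{2} a(\bar X, \bar Y)$ itself follows a noncentral chisquare distribution with $p$ degrees
of freedom and noncentrality parameter $\frac{m}{2}(\mu_1-\mu_2)'(\mu_1-\mu_2) = \frac{m}{2}\delta^2$.

Using these distributional properties, further simplification
of (6.6) is presented below.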
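
The distributional claim for $\frac{m}{2} a(\bar X, \bar Y)$ can be sanity-checked by simulation, since a noncentral chisquare with $p$ degrees of freedom and noncentrality $\lambda$ (in the sum-of-squared-means convention) has mean $p + \lambda$. The Python sketch below is a rough Monte Carlo check under explicit assumptions: identity covariance, two samples of size $m$, and $a(\bar X, \bar Y) = (\bar X - \bar Y)'(\bar X - \bar Y)$. The function name and its defaults are hypothetical.

```python
import random

def simulate_scaled_a(m, mu1, mu2, reps=20000, seed=0):
    """Monte Carlo mean of (m/2) * (xbar - ybar)'(xbar - ybar)."""
    rng = random.Random(seed)
    sd = (1.0 / m) ** 0.5          # std. dev. of each sample-mean coordinate
    total = 0.0
    for _ in range(reps):
        # Draw the two sample means directly from their gaussian distributions.
        xbar = [mu + sd * rng.gauss(0.0, 1.0) for mu in mu1]
        ybar = [mu + sd * rng.gauss(0.0, 1.0) for mu in mu2]
        a = sum((x - y) ** 2 for x, y in zip(xbar, ybar))
        total += 0.5 * m * a
    return total / reps
```

With `m = 10`, `mu1 = [1.0, 0.0, 0.0]` and `mu2 = [0.0, 0.0, 0.0]`, so that $p = 3$ and $\delta^2 = 1$, the simulated mean should lie near $p + \frac{m}{2}\delta^2 = 8$, up to Monte Carlo error.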
4


Set  c 


and 


Then 


m (m-1) 
(2m-l) 2 


m (ro-l) 
(2m-l) 


2 ' 


P[Z1Z1  ” ] ® (X, Y)  | a (X,Y)J 


„ _aa  -*  El|l 

T o /da\'3  1 r°°  e x 

3-0  * L — 1SI  dx 

(E±2i)2^ 


2 

.9(a)  = P2[Z|Z1  > (!4S^T")  a (X,  Y)  | a (X,Y) ] 


I e 

j,k=" 


1 2 •*■  2 
j+k  . « co  -7(x+y) 

1 2 ' -ilk!  ; ' 


E±2j_i  Ei^.! 
X y 


]lk!  C1“  r(E±2i)  r(Ei^i)  p+j+k 


”2  ' 2 


By changing to polar coordinates, i.e.,

$x = r\cos\theta$
$y = r\sin\theta$

we obtain


J»0 


jL 


dxdy  • 


3«»  . I C e-dV+k  r* 


2k+2  x 

f f {sin0  cos0  2 + 

n co/cos0 


2i+p. 


j / Jt=0 


2klP_!  ^fP-l  -|(cos0+8in0)  j+k+  x 
+ cos0  sin0  J e r 


dr  d0] 


co  I ^2-1  2k+p  2j+p. 

J c e-daa3+k  r sin0  2 cos0  2 + cos0  2 1 sin0  2 

j ,k=0  0 


{( 


j+k+p 


sin0+cos0 


COL  • 

j+KfP  1 e 5“(1+tane)  [£5.(i+tan0)  ] }d0 

i^o  r (i+1)  2 


where  C = (xO 


j+k 


j ! k ! 


p+j+k 


Finally, 


E(T*T*) 


■ E(3(a)) 

oo  <®  j+k+p- 1 ° 

■II  I C*  / 

1=0  j,k=0  i=0  0 


it  m ^P-l  (l+tan0) 

.T,  ~'ia  ,ma  e 

I le  ( — r ) 

0 4 


where  C*  = 
and 

q(0)  - 


-da  j+k+i 
e aJ 


q (0 ) 1 d0  d(-Ja) 

62 
"7 


C2 


j+k+p  Wp)  ~j  /62  l |Cli 


lMi+1) 


<T~>  jj  <*> 


ftlA+E) 


^S+P-l  li+P-i  2k+p-l  iiiP-i 

tsin0  2 cos0  2 + cos0  sim0  } 

{sin0+cos0}“(3+k+p) {l+tanO}1  . 


U1 


Interchanging the order of integration and then integrating over $a$ in the
above expression produces,


(6.7) 


* j+k+p-1  x 

E(V*V*)  - [ l C**  f q* (0)d0 

1 * fc,  j ,k=0  i*0  0 


where 


*+? 

r**  = c*  (J)  f (t+j+k+i+|) 


and 


m c -(§+*+j+k+i) 

q* (0 ) = q(0)  [^+d+|(l+tan0) 1 


This  last  integral  can  be  evaluated  numerically. 

At the present time we have not been able to calculate the
value of $E(T_1^* T_2^*)$, even with the help of a computer. Therefore,
the numerical values and a comparison of the two estimates of the
probability of error will be presented in a forthcoming technical report.
APPENDIX


The following computer program provides the expectation and
the variance of the estimate of the probability of error on the
design set. A plotting algorithm is also attached.


MAIN 


C     COMPUTE THE MEAN AND VARIANCE OF THE PROBA. OF MISCLASSIFICATION
      EXTERNAL F2
      COMMON/CONS1/C,D,AN,IP,I,J,K,L,JK,HP1,D1,DD
      COMMON/GAMA/GI1,GJ1,GK1,GL1,GJHP,GKHP,GJKP,GIJKL
      DATA PI4,AERR,RERR,EPS/.7853981634E0,1.0E-6,1.0E-4,1.0E-1/
      DATA ZERO,ONE,TWO/0.0E0,1.0E0,2.0E0/
      DO 600 ID=1,1
      IDS=ID-1
      WRITE(6,920)
      DO 500 I1=1,1
      READ(5,900) IP
      HP=FLOAT(IP)/TWO
      HP1=HP-ONE
      DO 480 I2=1,1
      IN=I2*5
      AN=FLOAT(IN)/TWO/TWO
      D1=FLOAT(IDS)*AN
      AN1=TWO*IN-ONE
      AN2=FLOAT(IN*(IN-1))
      AN3=D1*AN2/TWO
      AN4=ONE/SQRT(AN1)
      AN5=ONE/SQRT(AN1+TWO*TWO*AN2)
      RHO=-AN4*AN5
      AC=(ONE-RHO)/TWO
      AL1=AN3*(AN4-AN5)**2/(ONE+RHO)
      AL2=AN3*(AN4+AN5)**2/(ONE-RHO)
      T1=EXP(-AL1-AL2)
      T2=ONE
      EV1=ZERO
      DO 300 IR=1,21
      AA=HP1+IR-ONE
      T3=ONE
      EV0=ZERO
      DO 200 IS=1,21
      AB=HP1+IS-ONE
      CALL MDBETA(AC,AA,AB,T4,IER)
      EV=(ONE-T4)*T2*T3



EVO-EVO+EV 

IF(EV.LE.l.OE-Io)  GO  TO  210 

T3-T3*AL2/IS 

CONTINUE 

T2»T2»AL1/IR 

EV1-EV1+EVO 

IF(EVO.LE.l.OE-b)  GO  TO  310 

CONTINUE 

EV1-EV1*T1 

R=  AN  2 / AN 1**2 

C»tt*IN*IN?TWO 

D*b89in-1)*(IN-1) 

CC*  C/TWO 

CSS-FXP(-Dl)*AN**HP 

TLL*9 

IF( IDS.EQ.O)  ILL-1 
EV1V2-ZER0 
DO  450  IL-1,111 

L-IL-1 

GL1-GAMMA( float(  il) ) 

FVV1-ZER0 

DO  440  IJ-1,11 

J-IJ-1 

GJ1-GAMMA(FL0AT(IJ) ) 

GJHP-GAMMA( J + HP^ 

EVVO-ZERO 
DO  420  IK-1,11 
K = IK-1 

GK1=GAMMA( FLOAT( IK) ) 

GKKP=GAMMA(K+HP) 

JK-J+K 

JKFJK+IP 

GJKP=GAMMA( FLOAT ( JKP ) ) 

EVV-ZERO 

DO  400  I I -l , JKP 
I-II-l 

GI 1-GAMMA( FLOAT(  II ) ) 

GI JKL=GAMMA(  I-fJK  + L+HP  ) 

ETEMP-CSS*DCADRE(F2,EPS  ,PI4  ,AERR  , RERR  .ERROR  ,IER) 
EVV-EVV+ETEMP 

IF ( ETEMP . LE . AERR ) GO  TO  410 

CONTINUE 

EVVO-EVVO+EVV 

IF(EVV.LE.a.0E-6)  GO  TO  430 

CONTINUE 

EVV1-EVVH-EVV0 

IF(EVVO.LE. 5.0E-5)  GC  TO  445 
CONTINUE 



VARE-(EVl-EVl*EVl-*-(lN-l)*(EVlV2-EVl»EVl)  )/IN 

WPITE( 6,950)  EV1,EV1V2,VARE,IP,IN  .IDS 

CONTINUE 

CONTINUE 

FORMAT ( 315) 

FORMAT(  • IP^'  ,2(I3,2X),U(E1J*.7,2X)) 

FORM AT ( 7X, 'EV1' ,11X, • EV1V2 ' ,12X 'VARE' , 8X , • IP ' , 3X , • IN ' ,2X  , ' IDS  ' ) 
FORMAT (2X, 'EVV1- ’ ,2 ( El  5 . 3 , 2X ) , 3 ( 1 5 , 2X ) ) 

FORMAT ( 2X,3(EiU.7,2X)  ,3(I3,2X)) 

STOP 

END 
C     COMPUTE THE EXPECTED VALUE AND VARIANCE OF THE PROBA. OF
C     ERROR ON THE DESIGN SET
C
C     E(V1)=SUM(EXP(-LAMDA**2/2.)*(1./FACT(R))*(LAMDA**2/2.)**R
C          *GAMMA(R+L/2.+0.50)*I(L+2R,N)/GAMMA(0.50)/GAMMA(L/2
C     SUM OVER R = 0 TO INFINITY
C     INPUT IL,IN,ID1
C
C     DCADRE IS AN IMSL LIBRARY FUNCTION WHICH INTEGRATES F(X)
C     USING CAUTIOUS ADAPTIVE ROMBERG EXTRAPOLATION
C     MDBETA IS AN IMSL LIBRARY SUBROUTINE WHICH DOES INCOMPLETE BETA
C     PROBABILITY DISTRIBUTION FUNCTION INTEGRATION


FORTRAN IV G LEVEL 21             MAIN              DATE = 76014

      EXTERNAL F1
      COMMON /INPUT1/IL,IN,ID1
      DATA A,AERR,RERR,PI4/1.0E-4,1.0E-6,1.0E-4,.7853981634E0/
      DATA IPEN,IFLAG,ICOMNT,NI/1,0,1,7/
      DATA HALF,ONE,TWO/.5E0,1.0E0,2.0E0/
      REAL XLABL(19),YLABL(19),TITL(19),ABSC(7),ORD(7)
      REAL X(4)/0.0,12.0,6.0,0.0/,Y(4)/0.0,0.025,5.0,0.0/
      CALL PLOTID
      CALL PLOT(0.0,2.5,-3)
      READ(5,900) NXL,XLABL,NYL,YLABL
      DO 600 ID=1,3
      ID1=ID-1
      WRITE(6,920)
      READ(5,900) NTL,TITL
      CALL GRAPHS(X,Y,XLABL,NXL,YLABL,NYL,TITL,NTL)
      DO 500 I1=1,3
      ISYMB=I1
      READ(5,910) IL
      DO 400 I2=1,7
      IN=I2*5
      D1=FLOAT(IN*ID1)/TWO/TWO
      T1=EXP(-D1)/TWO
      AN1=(TWO*IN-TWO)/(TWO*IN-ONE)
      IR=0
      EV0=0.0E0
      T2=ONE
  150 S=FLOAT(IL)/TWO+FLOAT(IR)
      CALL MDBETA(AN1,S,HALF,T3,IER)
      EVR=T2*T3
      EV0=EV0+EVR
      IF(EVR.LE.AERR) GO TO 200
      IR=IR+1
      T2=T2*D1/FLOAT(IR)
      GO TO 150
  200 EV1=EV0*T1
      EV1V2=DCADRE(F1,A,PI4,AERR,RERR,ERROR,IER)
      VARE=(EV1-EV1*EV1+(IN-1)*(EV1V2-EV1*EV1))/IN
      ORD(I2)=VARE
      ABSC(I2)=FLOAT(IN)/FLOAT(IL)
  400 WRITE(6,940) IL,IN,ID1,ABSC(I2),ORD(I2),EV1,EV1V2
      CALL DATA?(ABSC,ORD,NI,IPEN,ISYMB,IFLAG,ICOMNT)
      CALL GRAPHS(X,Y,XLABL,NXL,YLABL,NYL,TITL,NTL)
  500 CALL DATA?(ABSC,ORD,NI,IPEN,ISYMB,IFLAG,ICOMNT)
  600 CALL PLOT(10.0,0.0,-3)
      CALL FRAME?
  900 FORMAT(I4,19A4)
  910 FORMAT(I5)
  920 FORMAT(/,2X,'IL',3X,'IN',3X,'ID1',8X,'N/L',13X,'VARE',12X,
     *'EV1',12X,'EV1V2',/)
  940 FORMAT(1X,3(I3,2X),4(E14.7,2X))
 1000 STOP